1.  INTRODUCTION


The SDI service based on DDC magnetic tapes and provided for a period
of a year was compared in a preliminary report¹ with the results
obtained in an experiment with one preliminary IEEE tape compiling
Science Abstracts titles. The INSPEC-2 version of the IEEE Science
Abstracts tapes was used. This particular format included titles but
no abstracts, and offered the most timely bibliographic record covering
international journal articles, symposia, and conference papers.
Timeliness was considered more important than the abstracts. In
addition, at the request of our Technical Director, we began processing
tapes that Engineering Index, Inc., marketed as a by-product of its
monthly bibliography.

Beginning with CY 1971, our subscribers in HDL received five different
SDI bulletins each month. Two issues contained titles derived from the
DDC tapes (semi-monthly), two were extracted from the IEEE tapes
(semi-monthly), and one was derived from the tape of the Engineering
Index (monthly). A purge process,² which we described in preceding
reports, transformed the different tapes into one HDL standard format so
that one retrieval program could be used to generate the five issues of
SDI bulletins, which varied for each of our subscribers. In each
instance the profiles of the subscribers, some individuals and some
teams, were matched with the terminology of the titles and subject
headings.

The five monthly issues, covering the three distinct services
separately, were deemed appropriate for various reasons:

(1) The subscribers preferred to receive brief print-outs; their
brevity facilitated return of the carbon copies to the information
office with the users' selections and evaluations in a relatively
short time.

(2) The responses from the subscribers were treated as requests
and prompted acquisition, circulation, and inter-library loan action;
thus the workload of the library personnel resulting from the requests
was more evenly spread in time.

(3) The statistics kept for the individual responses permitted
an evaluation of the selections, of the three tape services, and of
the subscribers. Moreover, they provided some basis for determining
the relative cost and efficiency of each of the three tape services.

The objective of selecting and building a pool of mission-related
bibliographic information from the source tapes for future on-line
retrieval operations or for the rapid compilation of highly selective
bibliographies on specific subjects is approached methodically through
the SDI procedure: first, by the elimination (purge) of all subject
groups that are not concerned with the mission of the organization;
second, by the selection or matching process that uses the subscribers'
interest profiles; and third, by the evaluations of the recipients of
the service, as reflected in their requests for the acquisition of
titles to be retained in the library, or for a circulation and often
retention copy of the selected item.


¹ Altmann, B., Comparison of HDL SDI Services Based on a Preliminary
IEEE Tape and on DDC Tapes, TM 70-25, Harry Diamond Laboratories,
Washington, D.C., 1970.

² Altmann, B., The HDL Automated Information System, TR 1523, Harry
Diamond Laboratories, Washington, D.C., 1970, pp. 36-38.
Another year has passed since the preliminary report on the SDI
system was issued, and a review of the new service, of its impact on
the scientists and engineers of the installation, and of its
contributions to the entire HDL information system is now possible.

2.  EVALUATION  OF  SDI  SERVICE 

To assess the relative value of the three tape services and the SDI
system, one must take into consideration the cost of the subscriptions,
the computer time necessary to process the tapes and generate the
service, and their relative utility. The computer cost covers four
steps performed at HDL: (1) eliminating the bibliographic information
about titles that are not related to HDL missions or areas of interest,
(2) reformatting the remaining entries into an organization that
permits the application of the standardized HDL retrieval or selection
program, (3) matching the individual interest profiles with the purged
and reorganized tapes, and (4) printing the selected information in
the form of individually addressed bulletins tailored to the profiles
of particular individuals or teams.
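
The four processing steps can be sketched as follows. This is only an
illustration in Python, not the HDL program, which this report does not
reproduce; the record layout, the function names, and the sample data
are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One bibliographic entry as read from a source tape (hypothetical layout)."""
    title: str
    subject_group: str
    headings: list[str] = field(default_factory=list)


def purge(entries, mission_groups):
    """Step 1: drop entries whose subject group lies outside the mission."""
    return [e for e in entries if e.subject_group in mission_groups]


def to_standard(entry):
    """Step 2: reduce an entry to one common record (title plus lowercase terms)."""
    text = " ".join([entry.title, *entry.headings]).lower()
    return {"title": entry.title, "terms": set(text.split())}


def match(records, profiles):
    """Step 3: per subscriber, select records sharing a term with the profile."""
    return {name: [r["title"] for r in records if r["terms"] & terms]
            for name, terms in profiles.items()}


def print_bulletins(selections):
    """Step 4: render one addressed bulletin per subscriber with selections."""
    return {name: f"SDI bulletin for {name}\n" + "\n".join(titles)
            for name, titles in selections.items() if titles}


entries = [
    Entry("Fuze timing circuits", "electronics", ["fuzes"]),
    Entry("Crop rotation yields", "agriculture"),
]
profiles = {"Team A": {"fuze", "fuzes"}, "Team B": {"radar"}}

records = [to_standard(e) for e in purge(entries, {"electronics"})]
bulletins = print_bulletins(match(records, profiles))
```

Because every tape is reduced to the same record in step 2, steps 3 and
4 need to be written only once for all three services.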

The statistical figures on the SDI service do not cover a period of
12 months; rather, they were collected in overlapping periods of 6 to
7 1/2 months from:

the Engineering Index tapes for 6 months (6 issues) with 37,347 titles,
the Science Abstracts tapes for 7 months (14 issues) with 60,397 titles,
and the DDC tapes for 7 1/2 months (15 issues) with 23,924 titles.

The information collected is assumed to be representative, and the
numerical information on titles, operations, services, and efficiency
was projected to a period of one entire year for the evaluation.


Table A.  Subscription Cost versus Titles Used

            Subscription  Titles     Titles retained      Selected from purge tape
            cost per      supplied   after purge          for SDI bulletins
Supplier    year          on tapes                        Unique titles  All titles
            ($)           (number)   (number)  (percent)  (percent)      (percent)

DDC          1,000         38,280     28,680      74         33             55
Sci. Abs.    3,700        103,527     67,076      63         20             26
Eng. Ind.    6,800         75,093     47,846      64         12             15

The Engineering Index tapes supplied about two times, and the
Science Abstracts tapes about 2.7 times, the number of titles supplied
by the DDC tapes. After the first computer process (the purge),
74 percent of the DDC entries were retained; of these, 33 percent of
the titles were selected for the SDI bulletins. A number of titles
were published in several bulletins, i.e., were supplied to several
clients. Counting these repeated printings raises the share of the
retained DDC titles selected and printed in the individual bulletins
sent to the 59 subscribers from 33 to 55 percent. The comparable
figures for the Science Abstracts were 63, 20, and 26 percent, and for
the Engineering Index, 64, 12, and 15 percent. These figures reflect
the relative compatibility of the data that the three services
provided with the actual information requirements of the installation,
in that 66 percent of the unique DDC titles, 30 percent of the unique
Science Abstracts titles, and 23 percent of the unique Engineering
Index titles had to be duplicated in the various SDI bulletins
distributed.
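
The duplication percentages follow from the last two columns of
Table A. The short Python check below is an illustration only; it uses
the rounded percentages printed in the table, so it gives about 67, 30,
and 25 percent, whereas the 66, 30, and 23 percent quoted above were
presumably computed from the unrounded title counts.

```python
# Percent of retained titles selected for SDI bulletins (Table A).
table_a = {
    "DDC": {"unique": 33, "all": 55},
    "Sci. Abs.": {"unique": 20, "all": 26},
    "Eng. Ind.": {"unique": 12, "all": 15},
}


def duplication_percent(unique, all_titles):
    """Repeated printings of titles, relative to the unique titles selected."""
    return 100 * (all_titles - unique) / unique


duplication = {name: duplication_percent(row["unique"], row["all"])
               for name, row in table_a.items()}

for name, value in duplication.items():
    print(f"{name}: about {value:.0f} percent duplicated")
```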

The true value of a service can be measured only in terms of the
utility of the titles to the subscribers as reflected by subscriber
response. With each bulletin a carbon copy was distributed, which
the subscribers were requested to return with their evaluations. A
number of subscribers, however, did not bother to return their carbon
copies. They screened their bulletins and requested documents and
papers of interest directly from the circulation desk. When
interviewed about their attitude and opinions, they expressed the
desire to be retained on the subscription list and definitely
considered themselves to be beneficiaries of the service. These
responses, however, cannot be used in a statistical survey. The
results for the cooperating subscribers are given in Table B.


Table B.  Relevance Ratios Calculated for the
          Returns of Cooperating Subscribers

Tape         Titles published    Titles           Relevance ratio
service      in SDI bulletins    selected         (percent)

DDC               9,274            2,504              27
Sci. Abs.         7,107            3,221              45
Eng. Ind.         3,315            1,518              45

                 19,696 (Total)    7,243 (Total)      36 (Average)
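
The relevance ratio is the number of titles selected divided by the
number of titles published. As an illustrative check in Python (the
variable names are ours), the figures of Table B reproduce the printed
ratios when the fractional part is dropped:

```python
# service: (titles published in SDI bulletins, titles selected), from Table B
table_b = {
    "DDC": (9274, 2504),
    "Sci. Abs.": (7107, 3221),
    "Eng. Ind.": (3315, 1518),
}


def relevance_ratio(published, selected):
    """Selected titles as a percentage of the titles published."""
    return 100 * selected / published


ratios = {name: relevance_ratio(*counts) for name, counts in table_b.items()}

total_published = sum(p for p, _ in table_b.values())
total_selected = sum(s for _, s in table_b.values())
overall = relevance_ratio(total_published, total_selected)
```

The "average" of 36 percent in the table is the ratio of the two
totals, not the mean of the three individual ratios.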


The relevance ratio for the DDC bulletins (27 percent) is about 40
percent smaller than the ratios for the Science Abstracts and
Engineering Index bulletins (45 percent each). Concerning this large
discrepancy, the question was raised whether the organization by
subject groups in the DDC tapes was less appropriate for processing by
the HDL computer program or whether the selection performed by the
subscribers was not sufficiently exhaustive. To resolve the latter
question, the professional staff of STINFO screened all the
corresponding DDC bulletins (Government Reports Announcements and
Technical Announcement Bulletins) for pertinent titles that had not
been identified in feedback from the SDI subscribers.

In addition to the 1214 unique titles underlying the 2504 overlapping
selections made by the cooperating 67 percent of the subscribers, this
manual screening and selection process added another 1265 titles, thus
more than doubling the number of relevant titles. It can, therefore,
be assumed that the relevance ratio of bulletins derived from the DDC
tapes could increase to a minimum of 44 percent, provided that a
comprehensive HDL profile is applied and the print-out of the
corresponding titles reaches the engineers and scientists who are
responsible for their contents.




Time limitations and a missing link between titles on tapes and those
in the published (Science Abstracts) bulletins precluded a manual
supplementation of pertinent titles for the two other services. It
could, therefore, not be established whether the relevance ratios of
the other two services can be similarly improved.

The  computer  costs  for  the  SDI  service  are  listed  in  Table  C. 

These  have  been  extrapolated  to  yearly  costs  in  Table  D. 


Table C.  Computer Costs (per 1000 Titles) of the HDL SDI Service
          Derived From Three Different Tapes

                 Purge process    Selection process   Printing of bulletins
Tape service     (min)    ($)     (min)    ($)        (min)    ($)

DDC              43.2     112     24.4     64         18.9     49
Eng. Ind.        13.7      36     15.9     41          4.9     13
Sci. Abs.         5.4      14     15.9     41          5.2     14

Without  any  doubt  the  format  of  the  Science  Abstracts  tapes  is  most 
appropriate  for  the  present  HDL  computer  hardware  as  well  as  computer 
programs.  The  programs  used  for  processing  the  DDC  tapes  are  currently 
being  revised  to  reduce  the  relatively  high  cost. 


Table D.  Yearly Computer Cost of the HDL SDI Service

                         Purge process   Selection process   Printing of bulletins   Total cost
Suppliers of tapes       (min)   ($)     (min)   ($)         (min)   ($)             ($)

DDC (24 issues)          1655    4303     700    1820        116     303             6426
Eng. Ind. (12 issues)    1034    2688     765    1989         38      99             4776
Sci. Abs. (24 issues)     560    1457    1072    2786         94     245             4488
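
As an illustrative consistency check on Table D (Python; the names are
ours), each total is the sum of the three dollar columns. The ratio of
dollars to minutes is close to $2.60 per minute in every cell; this
rate is not stated in the report and is only inferred from the table.

```python
# (purge, selection, printing), each as (minutes, dollars), from Table D
table_d = {
    "DDC": [(1655, 4303), (700, 1820), (116, 303)],
    "Eng. Ind.": [(1034, 2688), (765, 1989), (38, 99)],
    "Sci. Abs.": [(560, 1457), (1072, 2786), (94, 245)],
}

# Yearly computer cost per tape service: sum of the dollar columns.
totals = {name: sum(dollars for _, dollars in steps)
          for name, steps in table_d.items()}

# Implied machine rate in dollars per minute for every cell of the table.
rates = [dollars / minutes
         for steps in table_d.values()
         for minutes, dollars in steps]
```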

If we relate the total (subscription and computer) cost to the number
of titles that the subscribers selected, the prices are $2.96 for
each DDC title, $2.54 for each Science Abstracts title, and $7.62 for
each Engineering Index title. For 59 subscribers* the service
distributes 295 individual bulletins each month, or 3540 bulletins per
year. The average cost per bulletin on this basis approximates $7.68.
If we take into consideration that the bulletins are also screened by
teams, so that they reach a total of 135 staff members, the cost per
individual subscriber is reduced to $3.35. For this price, reference
librarians could not render a similar service, even if one
acknowledges that selections from some given tapes will not be
successful and that some individual bulletins may not be distributed.
Nevertheless, the computer cost of the current services must be
reduced; it is hoped that more appropriate programs will improve the
efficiency, in particular for the service derived from the DDC tapes.


* At present the number has risen to 63.
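
The unit costs quoted above can be retraced from Tables A, B, and D.
The Python sketch below is an illustration under two assumptions of
ours: that the yearly cost was divided by the selected-title counts of
Table B, and that the $3.35 figure is the cost per bulletin when 135
readers rather than 59 subscribers are counted. Under these assumptions
the results agree with the quoted prices to within one cent.

```python
subscription = {"DDC": 1000, "Sci. Abs.": 3700, "Eng. Ind.": 6800}  # Table A ($)
computer = {"DDC": 6426, "Sci. Abs.": 4488, "Eng. Ind.": 4776}      # Table D ($)
selected = {"DDC": 2504, "Sci. Abs.": 3221, "Eng. Ind.": 1518}      # Table B

# Total yearly cost and cost per selected title for each tape service.
total = {name: subscription[name] + computer[name] for name in subscription}
per_title = {name: total[name] / selected[name] for name in total}

subscribers, bulletins_per_month, readers = 59, 5, 135
bulletins_per_year = subscribers * bulletins_per_month * 12

grand_total = sum(total.values())
per_bulletin = grand_total / bulletins_per_year
per_reader_bulletin = grand_total / (readers * bulletins_per_month * 12)
```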
Whenever warranted by deficiencies in current bulletins, an effort
must be made in certain cases to assure a more exhaustive recall of
pertinent titles than the matching process with the few specific terms
provided by a profile formulation will permit. We have selected from
the tape of the Engineering Index Thesaurus the HDL mission-related
terminology and are therefore prepared to introduce broader, narrower,
or related terms into the profiles without manual effort.
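
Such a profile expansion might look as follows. The Python sketch is
an illustration only: the thesaurus fragment and its layout are
invented for the example and do not reproduce the Engineering Index
Thesaurus tape.

```python
# Hypothetical thesaurus fragment: term -> relation -> list of terms.
THESAURUS = {
    "capacitors": {
        "broader": ["electric components"],
        "narrower": ["electrolytic capacitors"],
        "related": ["dielectrics"],
    },
}


def expand_profile(terms, relations=("broader", "narrower", "related")):
    """Add the requested kinds of thesaurus terms to a subscriber profile."""
    expanded = set(terms)
    for term in terms:
        entry = THESAURUS.get(term, {})  # terms without an entry stay as they are
        for relation in relations:
            expanded.update(entry.get(relation, []))
    return expanded


profile = expand_profile({"capacitors", "fuzes"}, relations=("narrower", "related"))
```

Restricting the relations, as in the last line, widens recall in one
direction only; adding broader terms would enlarge the bulletin most.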

An important objective, still, is a more exhaustive exploitation of
the tape services we receive to collect a comprehensive data file to
support future bibliographic requirements. For this purpose the HDL
information office is preparing interest profiles going beyond the scope
of those provided by our subscribers. These profiles are based on the
information furnished by DD 1498 (Research and Technology Work Unit
Summary), DD 1634 (Research and Development Planning Summary), and DA
3664-R (Research and Development Planning Summary Army R&D Management
Data), which relate in greater detail the scope of the current HDL and
HDL-related scientific and technical efforts.

While  the  combination  of  ephemeral  user  requirements  with  a  mission- 
concerned  HDL-wide  profile  should  enhance  the  recall  capability  of  the 
system  and  the  strength  of  the  HDL  service  as  a  reference  center,  efforts 
must  also  be  exerted  to  improve  the  relevance  ratios  of  the  SDI  bulletins 
and  of  the  future  bibliographies  to  be  derived  from  the  installation's 
growing data file. The test results indicate the necessity of such an
improvement.

The cooperating segment of our subscribers, whose feedback provided
the statistical data for this study, represents 67 percent of all
subscribers, each of them receiving an average of 500 titles during one
year. The minority of 33 percent that was not inclined to communicate
its own selections and evaluations to the personnel operating the
service received an average of 1100 titles. This result makes it
imperative to pay attention to a possible relationship between bulletin
size and subscriber cooperation.

On the basis of the statistical information derived from the returns
made by the so-called cooperative group, it appears that the average
bulletin processed by members of this group comprised eight titles. The
comparable size of those received by the group that did not return its
bulletins regularly was 18. Although the difference in numbers of titles
is not so great as to deter fully occupied engineers and scientists from
cooperating with the library, we will attempt to refine the profiles more
frequently and, furthermore, explore whether smaller bulletins will
not only result in better cooperation, but will also induce a greater
number of the professional staff to subscribe to the service.

Any  type  of  compulsion  is  contrary  to  HDL  policy.  All  subscriptions 
and  all  communications  are  voluntary  and  primarily  motivated  by  the 
personal  advantage  of  the  subscriber. 

To refine the selectivity, in addition to the present regular review
and adjustment of the subscribers' profiles, the programs are being
changed to introduce weight factors, which the indexers of the three
tape-producing reference centers provide in different ways: DDC by
adding asterisks to terms considered representative of the main contents
of the paper or document, and Science Abstracts and Engineering Index by
assigning a broader heading under which the descriptive terms and
phrases of the analysts are placed in the index. Furthermore, the number
representing the subject group of the printed bulletin to which the
analyst assigns a title will be used in our effort to limit the output
in response to a particular profile.
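
For the DDC tapes, the use of such weight factors can be sketched as
follows. The Python example is an illustration only: the weights (2 for
a starred term, 1 otherwise), the threshold, and the sample entries are
assumptions of ours, not values taken from the revised HDL programs.

```python
def parse_ddc_terms(raw_terms):
    """Map each index term to a weight; starred (main-content) terms count double."""
    return {term.lstrip("*").lower(): (2 if term.startswith("*") else 1)
            for term in raw_terms}


def score(profile_terms, weighted_terms):
    """Sum the weights of the index terms that occur in the profile."""
    return sum(weight for term, weight in weighted_terms.items()
               if term in profile_terms)


def select(entries, profile_terms, threshold=2):
    """Keep only the titles whose weighted score reaches the threshold."""
    return [title for title, raw_terms in entries
            if score(profile_terms, parse_ddc_terms(raw_terms)) >= threshold]


entries = [
    ("Proximity fuze design", ["*Fuzes", "Antennas"]),
    ("Antenna survey", ["*Antennas", "Fuzes"]),
]
hits = select(entries, {"fuzes"})
```

With these assumed values, a title is printed only when the profile
term describes its main contents, which is one way to shorten a
bulletin without changing the profile itself.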


3.  INTEGRATED INFORMATION PROGRAM


Within the overall system of our information office the SDI program has
exceeded its original purpose, i.e., the selective dissemination of current
pertinent literature. It contributes significantly to our selection and
acquisition activities and serves as a vehicle for accumulating the
mission-related information that must be made available to guide future
operations of the installation intelligently and efficiently in fulfilling
its responsibilities as a lead laboratory and reference center. It has been
recognized that the SDI service could also be used to automate the
cataloguing activity. The realization of this possibility has been made
the subject of a continuing study.

Recently another project has received a still higher priority. This
project has the objective of informing the design engineer about current
objectives in materials and components research and development so that
he can consider, in current designs of electronic circuits, the most
advanced methods and products that can be utilized in the production phase.
To do this, the potential of components currently under study and
development, and their anticipated characteristics and parameters, must be
known. Present plans are to develop access to relevant parameter
information as part of the HDL SDI operations. Such access is presently not
provided by information centers, because they limit their indexing effort
to the subject content of the reports and papers and completely omit
numerical data. This limitation also curtails the usefulness of the GIDEP
(Government Industry Data Exchange Program), which is concerned with
on-shelf or completed items. Although the HDL project will not cover GIDEP
reports, it must be related to the future indexing efforts of this much
more comprehensive service, and attempts will be made to coordinate studies
and solutions with those that GIDEP might offer at a later time to cope
with the parameter problem.

In order to establish a parameter information center for airborne-
electronic engineering applications, the HDL Information Office must
extract and analyze the pertinent publications (reports, articles, etc.),
because the information centers have not only excluded numerical data
from indexing, but also do not record all useful or necessary data in the
abstracts on their tapes. In many instances, the office will have to
request the appropriate information directly from the sources on specially
designed questionnaires as soon as relevant projects have been announced
through DD Form 1498, DD Form 1634, and DA Form 3664-R, and the Status
Reports on Advanced Electron Device Technology, issued for the Office
of the Director of Defense Research and Engineering, or through the
specialized tape service to be acquired from the Smithsonian Science
Information Exchange.

For the retrieval of pertinent data from the abstracts on the tapes,
a conceptual approach is recommended that can be described as a reverse
form of our ABC method.³ This method was conceived to eliminate the
tremendous, if not impossible, task of producing and maintaining a
complete thesaurus of millions of terms and phrases annotated to identify
semantic and grammatical peculiarities and syntactical requirements,
the preconditions not only for retrieval operations but also for
automated string analysis, textual standardization, and computer
translations. The HDL method generates as a by-product a compilation of
the entire processed and continuously accumulating terminology in groups
identified by functions or relations. It is an extension of Jost Trier's⁴
concept that any person attempting to communicate a hunch or idea must
throw a net (i.e., a loose structure of thoughts) over an initially hazy
intuition to catch its contents or meaning before representing it in the
customary and precise symbols of a given language. A test showed that
the utilization of this method for processing or indexing will also
create various word-independent storage and retrieval systems.


³ Altmann, B., The HDL Automated Information System, TR 1523, Harry
Diamond Laboratories, Washington, D.C., 1970, pp. 36-38.

⁴ Altmann, B., and Walter Riessler, Automation of ABC System, TR 1392,
Part I, Linguistic Problems and Outline of a Prototype Test, Harry Diamond
Laboratories, Washington, D.C., 1968, p. 17.

In our task of extracting parameter data from tapes we cannot work
with a standardized indexing terminology, but must deal with a great
variety of linguistic forms in which parameters can be communicated and
are in fact recorded on the different tapes. It is, therefore, planned
to construct typical combinations, nets in Trier's terminology, to
catch the parameter data and transform them into a standard format. We
have started to collect the linguistic forms and expressions for
parameters from different abstracts and texts. At this time the terms for
measuring units such as "ohms," "volts," "farads," etc. appear to be
the most efficient symbols that can be applied to trigger a comprehensive
automatic printout of the desirable parameter data. Because these
unit names will frequently be introduced by prefixes such as "mega,"
"kilo," "milli," "giga," etc., to define particular magnitudes, and
followed by plural or other endings, truncating the search term on the
left and on the right appears to be a logical answer. Although the method
is feasible according to a preliminary analysis and estimate, it is
admitted that the efficiency of this idea must still be proven by test.
In addition, various prepositional and verbal combinations are being
considered to formulate succinct and effective retrieval profiles.
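
The idea of truncating to the left and to the right can be sketched
with a regular expression. The Python example is an illustration of
ours, not the planned HDL program: the pattern accepts any word that
contains one of the unit stems and reports it together with the number
that precedes it; the list of stems and the sample abstract are
invented for the example.

```python
import re

# Unit stems used as triggers; prefixes (left) and endings (right) are open.
UNIT_STEMS = ("ohm", "volt", "farad")

PATTERN = re.compile(
    r"(\d+(?:\.\d+)?)\s*(\w*(?:" + "|".join(UNIT_STEMS) + r")\w*)",
    re.IGNORECASE,
)


def find_parameters(text):
    """Return (value, unit word) pairs for every number followed by a unit word."""
    return [(float(value), unit.lower()) for value, unit in PATTERN.findall(text)]


abstract = ("The film resistor measured 4.7 kilohms at 12 volts "
            "with a 100 microfarad bypass.")
params = find_parameters(abstract)
```

A pattern this loose will also catch words that merely contain a unit
stem, so the hits would still have to be screened, as the text
anticipates.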

Although  the  automatic  process  of  retrieving  pertinent  data  from 
tapes  should  provide  access  to  a  great  number  of  indexed  reports  and 
papers,  it  is  anticipated  that  the  texts  themselves,  reports  and 
papers  in  addition  to  the  answers  to  our  questionnaires,  will  have  to 
be  consulted  and  analyzed  for  adequate  selection  and  evaluation  in 
many  instances. 

One product of the service will be a card catalog, arranged
alphabetically by components, devices, and materials, each of them
subdivided by the names of the parameters and their numerical ranges in
ascending order. The location symbol of the source document will
guide the engineer to the more detailed information. An SOP on the
arrangement and the presentation of the numerical information has
been previously prepared and tested in connection with our ABC
project.
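
The arrangement of such a catalog amounts to a three-level sort. The
Python sketch below is an illustration only; the card layout, the
sample values, and the location symbols are hypothetical and are not
taken from the SOP.

```python
from typing import NamedTuple


class Card(NamedTuple):
    item: str        # component, device, or material
    parameter: str   # name of the parameter
    low: float       # lower bound of the numerical range
    high: float      # upper bound of the numerical range
    location: str    # location symbol of the source document (hypothetical)


def catalog_order(cards):
    """Alphabetical by item, then by parameter, then ascending numerical range."""
    return sorted(cards, key=lambda c: (c.item.lower(), c.parameter.lower(),
                                        c.low, c.high))


cards = [
    Card("Thermistor", "resistance", 1e3, 1e4, "TR-0002"),
    Card("Capacitor", "capacitance", 1e-6, 1e-5, "TR-0003"),
    Card("Capacitor", "capacitance", 1e-9, 1e-8, "TR-0001"),
]
ordered = catalog_order(cards)
```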

A second product of the project will be a KWIC list of the
descriptions (i.e., parameters) of all the items in the information
system. The required specific parameter data for a particular component
or device can therefore be found easily and evaluated in context with all
its other parameter data.
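
A KWIC (keyword-in-context) list of this kind can be produced along the
following lines. The Python sketch is an illustration only; the stop
list, the line width, and the sample description are assumptions.

```python
# Words that are not used as keywords (assumed stop list).
STOPWORDS = {"of", "at", "the", "a", "and", "in"}


def kwic(descriptions, width=30):
    """Return one line per keyword, sorted by keyword, with its context."""
    lines = []
    for description in descriptions:
        words = description.split()
        for i, word in enumerate(words):
            if word.lower() in STOPWORDS:
                continue
            left = " ".join(words[:i])[-width:]       # context before the keyword
            right = " ".join(words[i + 1:])[:width]   # context after the keyword
            lines.append((word.lower(), f"{left:>{width}}  {word}  {right}"))
    return [line for _, line in sorted(lines)]


index = kwic(["tantalum capacitor leakage current at 25 volts"])
for line in index:
    print(line)
```

Every parameter of an item appears once as a keyword with the other
parameters beside it, which is what allows a value to be judged in
context.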


While the inclusion of the parameter information service will broaden
and intensify the activities of the HDL reference service, in so far as a
re-evaluation of existing indexing and retrieval methods will become
necessary, the present SDI service will also be subjected to a
critical review to aid this effort. More appropriate computer
programs are being written to improve the efficiency, in particular for
the service derived from the DDC tapes.

Whatever improvements result from these changes and other experiments,
we cannot expect to achieve services that will greatly exceed those of a
good subject card catalog. We would deceive ourselves if we assumed that
any automated system using combinations of terms or subject headings, with
or without statistical manipulation, can overcome the inherent limitations
of such a system. Even tests constructed to prove the opposite are not
convincing because they are based on a defective understanding of the
thinking process and of the nature of language. The root⁵ of the trouble
is that present automated systems cannot handle natural language and, when
artificial languages must be used, the systems tend to create as many
problems as they resolve. There is some confusion on this point because
existing systems ("keyword" systems, for instance) can cope with isolated
words. But as any student of a foreign language has discovered,
recognition of isolated words is not at all the same as handling natural
language. We can in this connection only point to the innumerable ways one
thought or concept can be formulated linguistically, and to the many
different meanings one word will assume in different contexts. The
functional role of the preposition, for example, is not determined by
logic, but by etymology and long-established custom or rule. A project
governed by a mechanistic concept that words, the symbols shaped and
changed during the long history of people and nations, can be put together
like a jig-saw puzzle is probably destined to end in failure. The much
more complex project of computer-assisted translations that the Air Force
has operated offers an outstanding example. The selection of books was
excellent, but our scientists and engineers who received the translations
frequently requested new ones because the English texts were distorted and
did not make sense.

4.  CONCLUSION

The trend toward mechanistic systems rather than toward conceptual and
therefore more efficient and successful information systems has been
disheartening. Therefore, it is gratifying to see at least two projects
(see supplements A and B) in which the basic principle of encoding well
defined, and consequently meaningful, statements or descriptors for
retrieval has been revived. If both projects persist, the results should
contribute to a better understanding of the importance of the linguistic
element in documentation and of its philosophical or psychological
peculiarity; I believe the results could further reflect negatively on the
efficiency of statistical and mathematical methods and techniques. A
departure from mechanistic and atomistic concepts concerning language and
the introduction of a humanistic approach must, in order to succeed, be
accompanied by a management policy that insists upon integration of all
efforts into an overall system. If arguments for economy are allowed to
restrict the scope of an individual project to very narrowly defined,
isolated tasks, these in the end must turn out to be expensive,
short-lived, and largely wasteful. A restricted SDI activity presents a
good example. It will always be a relatively costly enterprise, whatever
the number of its subscribers may be, unless its functions are broader than
the production of bulletins. Logical additional functions are: support of
the acquisition policy of an installation, generation of the data bank from
which mission-related efforts can receive substantial information on the
entire available state of the art, and assistance in required specialized
analytical services not provided by national or professional documentation
centers.


⁵ Bross, Irwin, Roger Priore, and others, Feasibility of Automated
Information System in the Users' Natural Language, American Scientist,
57, No. 2, 1969, p. 195.


SUPPLEMENT  A 


HEALTH  ELEMENT  ASSESSMENT  PROJECT 

The purpose of the Health Element Assessment Project6 is the development
of standardized syntactical statements describing the contracts or
grants of the Health Services and Mental Health Administration,
Department of Health, Education, and Welfare (HEW), and their objectives,
in such a way that all elements of the statement can be made subject to
a computerized retrieval operation. For guidance of the analysts who
generate the input, a work sheet in matrix format has been designed to
assure a standard sequence: the contractor or grantee, the verb
expressing his activity, the noun representing the substance or direct
objective of the contract, the indirect objectives (several
possibilities), the geographic limitations of the effort, and finally
the procedure (method or technique). Each of the noun entries mentioned
can be preceded by an adjective (or qualifier), and each noun in the
string following the verb can be introduced by a link to establish the
accurate relationship between the subsequent nouns. Furthermore, each
noun entry can be a compound of up to four terms (nouns or adjectives).
Great importance is attached to the verb because each verb used in a
description requires its own string and work sheet, even if all other
entries are identical. Some preparations have been made to restrict the
choice of the verbs.
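
The work sheet layout described above can be pictured as a small record structure. The following is only a minimal sketch in Python, not the project's actual format: the class names, field names, and sample terms are all hypothetical, and only the slot order, the optional qualifier and link, the four-term limit, and the one-sheet-per-verb rule are taken from the description.

```python
from dataclasses import dataclass, field
from typing import List, Optional

MAX_TERMS = 4  # the description allows compounds of up to four terms


@dataclass
class NounEntry:
    """One noun slot of the work sheet (hypothetical representation)."""
    terms: List[str]                 # compound of one to four nouns or adjectives
    qualifier: Optional[str] = None  # optional preceding adjective
    link: Optional[str] = None       # optional link introducing the noun

    def __post_init__(self):
        if not 1 <= len(self.terms) <= MAX_TERMS:
            raise ValueError("a noun entry holds one to four terms")

    def render(self) -> str:
        parts = [self.link, self.qualifier, " ".join(self.terms)]
        return " ".join(p for p in parts if p)


@dataclass
class WorkSheet:
    """One descriptive string; the slot order follows the text."""
    contractor: NounEntry
    verb: str
    direct_objective: NounEntry
    indirect_objectives: List[NounEntry] = field(default_factory=list)
    geographic_limits: Optional[NounEntry] = None
    procedure: Optional[NounEntry] = None

    def statement(self) -> str:
        slots = [self.contractor.render(), self.verb,
                 self.direct_objective.render()]
        slots += [n.render() for n in self.indirect_objectives]
        for entry in (self.geographic_limits, self.procedure):
            if entry is not None:
                slots.append(entry.render())
        return " ".join(slots)


def sheets_for_verbs(verbs: List[str], **other_slots) -> List[WorkSheet]:
    """Each verb gets its own sheet, even if all other entries are identical."""
    return [WorkSheet(verb=v, **other_slots) for v in verbs]


# Invented sample data, for illustration only.
sheet = WorkSheet(
    contractor=NounEntry(["university"], qualifier="state"),
    verb="evaluates",
    direct_objective=NounEntry(["screening", "program"]),
    indirect_objectives=[NounEntry(["children"], link="for")],
    geographic_limits=NounEntry(["counties"], qualifier="rural", link="in"),
    procedure=NounEntry(["survey"], link="by"),
)
print(sheet.statement())
```

Because every slot is a separate field, each element of the statement can be searched on its own, which is the property the project aims for.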

The ABC system differs from this project in various respects. It
always nominalizes the verb, thereby frequently combining it with the
basic method or technique of the effort; it distinguishes between nouns
and qualifiers and avoids the confusion that arises when adjectives and
nouns are used both as qualifiers and as elements of a compound phrase;
and it does not encode prepositions or prepositional phrases, because
the use of prepositions is governed not by logic but by etymology and
grammatical rule. It identifies, however, functions and relationships,
and makes it clear whether they refer to the preceding element (or
entry) or to the main subject of the string.

It  will  have  to  be  demonstrated  by  test  whether  the  project  in  its 
present  form  will  facilitate  satisfactory  retrieval  of  specific  aspects 
included  in  the  descriptive  strings  of  the  Health  Elements  and  whether 
the system can be expanded to control and standardize the accumulating
data.

Apparently the author of the project does not consider the automatic
construction of syntactical thesauri or more sophisticated retrieval
methods, because they are not required in this particular service.


6 Lowell, D. James, Health Element Assessment Manual (Preliminary),
Office of the Administrator, Health Services and Mental Health Administration,
Department of Health, Education, and Welfare, Rockville, Maryland, 1971.


SUPPLEMENT  B 


PRECIS  SYSTEM 

The PRECIS7 and ABC methods both attempt to construct syntactical
descriptors with the assistance of a computer. D. Austin at first relied
to a great extent on the sequence of the individual key words within the
string for the identification of their relationships to the main subject
term as well as to each other, but he soon recognized that this approach,
as well as distinctions between generic, attributive, possessive, and
cause-effect relationships, did not eliminate ambiguities. He therefore
developed a system of "interconcept links," and for this purpose introduced
16 symbols called "relational operators" that were to designate functions
such as form and audience of publication, viewpoint, discipline,
environment, etc. These operators are relatively broad and are not always
clearly delineated. It might, e.g., be difficult to distinguish between
"study region" and "environment," as demonstrated by his classification of
"Great Britain" as environment rather than "study area." Furthermore, the
introduction of "Concept Codes" and "Term Codes" has increased the
difficulty of the analysis. The list of "Precis Entries" attached to the
report seems to indicate that greater emphasis is placed on the
standardization of lead concepts or keywords through the use of UDC codes
than on the standardization of functions and relations. For the
clarification of term relationships, the author found it necessary to
introduce "interconcept links" in natural language in addition to his
relational operators (codes).
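
To make the idea of operator-tagged strings concrete, the sketch below rotates a short concept string so that each term takes the lead position once while the others keep their operator labels. It is an illustration only: the labels are plain words borrowed from the functions named above, not the 16 PRECIS symbols, the sample terms are invented, and the entry layout is an assumption rather than the published PRECIS format.

```python
from typing import List, Tuple

# (operator label, term); labels are illustrative, not PRECIS's actual symbols
Concept = Tuple[str, str]


def rotate(string: List[Concept]) -> List[str]:
    """Return one index entry per concept, each concept taking the lead once."""
    entries = []
    for i, (operator, lead) in enumerate(string):
        # The remaining concepts keep their labels, so their stated
        # function is still visible after the rotation.
        context = [f"{term} ({op})"
                   for j, (op, term) in enumerate(string) if j != i]
        entries.append(f"{lead} ({operator}): " + "; ".join(context))
    return entries


# Invented sample string; "Great Britain" is tagged as environment,
# following the classification discussed in the text.
example = [
    ("environment", "Great Britain"),
    ("discipline", "education"),
    ("form", "bibliographies"),
]
entries = rotate(example)
for entry in entries:
    print(entry)
```

Had the analyst tagged "Great Britain" as a study region instead, every entry would carry a different label, which is why loosely delineated operators lead to inconsistent indexing.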

The  ABC  system  is  not  concerned  with  interlinkage  phraseology  during 
the  analytical  process,  but  has  provided  standardized  prepositions  and 
other  connectors  that  are  added  by  the  computer  during  program  execution. 
The  analyst  can  concentrate  on  the  subject  matter  when  he  answers  the 
standard  input  (form)  questionnaire;  and  program  and  computer  assure 
standard  sequence  and  combinations  of  terms  and  phrases  within  the  string, 
generate  separate  thesauri  by  encoded  functions  and  relations,  and,  with 
the  data  file  thus  formatted  in  the  memory,  facilitate  standardization  of 
future  inputs  and  operation  of  different  types  of  retrieval  systems 
including  an  automated  question-answering  service. 
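
The division of labor described above, in which the analyst supplies only subject terms and the program supplies sequence and connectors, can be sketched as follows. This is a minimal illustration under stated assumptions: the function codes, their order, the connector words, and the sample records are all hypothetical and are not taken from the ABC system itself.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical function codes in their standard sequence, each with the
# standardized connector the program inserts before the term.
SEQUENCE = ["ACTIVITY", "OBJECT", "METHOD", "LOCATION"]
CONNECTOR = {"ACTIVITY": "", "OBJECT": "of", "METHOD": "by", "LOCATION": "in"}


def build_string(answers: Dict[str, str]) -> str:
    """Assemble a descriptive string from questionnaire answers.

    The analyst may answer in any order; the standard sequence and the
    connectors are imposed here, not by the analyst.
    """
    parts = []
    for code in SEQUENCE:
        term = answers.get(code)
        if term:
            parts.append(f"{CONNECTOR[code]} {term}".strip())
    return " ".join(parts)


def build_thesauri(records: List[Dict[str, str]]) -> Dict[str, List[str]]:
    """Collect one sorted term list per function code across all records."""
    thesauri = defaultdict(set)
    for answers in records:
        for code, term in answers.items():
            thesauri[code].add(term)
    return {code: sorted(terms) for code, terms in thesauri.items()}


# Invented sample records, for illustration only.
records = [
    {"LOCATION": "laboratories", "OBJECT": "radar fuzes",
     "ACTIVITY": "testing", "METHOD": "simulation"},
    {"ACTIVITY": "testing", "OBJECT": "power supplies"},
]
for record in records:
    print(build_string(record))
print(build_thesauri(records))
```

Keeping a separate term list per function code is what would let later inputs be checked against terms already used in the same role.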


7 Austin, Derek and Peter Butcher, PRECIS, A Rotated Subject Index System
(and Supplement), London, Council of the British National Bibliography, 1969,
87 (17).